Results 1 - 20 of 7,315
1.
BMC Bioinformatics ; 25(1): 142, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38566005

ABSTRACT

BACKGROUND: The rapid advancement of new genomic sequencing technology has enabled the development of multi-omic single-cell sequencing assays. These assays profile multiple modalities in the same cell and can often yield new insights not revealed with a single modality. For example, Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-Seq) simultaneously profiles the RNA transcriptome and the surface protein expression. The surface protein markers in CITE-Seq can be used to identify cell populations in a manner similar to the iterative filtration process in flow cytometry, also called "gating"; this identification is an essential step for downstream analyses and data interpretation. While several packages allow users to interactively gate cells, they often do not process multi-omic sequencing datasets and may require writing redundant code to specify gate boundaries. To streamline the gating process, we developed CITEViz, which allows users to interactively gate cells in Seurat-processed CITE-Seq data. CITEViz can also visualize basic quality control (QC) metrics, allowing for a rapid and holistic evaluation of CITE-Seq data. RESULTS: We applied CITEViz to a peripheral blood mononuclear cell CITE-Seq dataset and gated for several major blood cell populations (CD14 monocytes, CD4 T cells, CD8 T cells, NK cells, B cells, and platelets) using canonical surface protein markers. The visualization features of CITEViz were used to investigate cellular heterogeneity in CD14 and CD16-expressing monocytes and to identify differences in the number of antibodies detected per patient donor. These results highlight the utility of CITEViz to enable the robust classification of single cell populations. CONCLUSIONS: CITEViz is an R-Shiny app that standardizes the gating workflow in CITE-Seq data for efficient classification of cell populations. Its secondary function is to generate basic feature plots and QC figures specific to multi-omic data. The user interface and internal workflow of CITEViz work together to produce an organized workflow and sensible data structures for easy data retrieval. This package leverages the strengths of biologists and computational scientists to assess and analyze multi-omic single-cell datasets. In conclusion, CITEViz streamlines the flow cytometry gating workflow in CITE-Seq data to help facilitate novel hypothesis generation.
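The iterative gating described in this abstract can be illustrated with a minimal sketch. It is not CITEViz code or its API: the `RectGate` class, the `apply_gates` helper, the CD3 marker, and all expression values are hypothetical and chosen only to show how rectangular gates on two surface markers are applied one after another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RectGate:
    """A rectangular gate on two surface protein markers (hypothetical helper)."""
    x_marker: str
    y_marker: str
    x_range: tuple  # (low, high), inclusive
    y_range: tuple  # (low, high), inclusive

    def contains(self, cell):
        x_lo, x_hi = self.x_range
        y_lo, y_hi = self.y_range
        return (x_lo <= cell[self.x_marker] <= x_hi
                and y_lo <= cell[self.y_marker] <= y_hi)


def apply_gates(cells, gates):
    """Filter cell barcodes through each gate in order, as in iterative gating."""
    selected = list(cells)  # start from every barcode
    for gate in gates:
        selected = [b for b in selected if gate.contains(cells[b])]
    return selected


# Made-up normalized protein levels for three cells.
cells = {
    "cell_a": {"CD3": 2.1, "CD4": 1.8, "CD8": 0.1},
    "cell_b": {"CD3": 2.4, "CD4": 0.2, "CD8": 2.0},
    "cell_c": {"CD3": 0.1, "CD4": 0.3, "CD8": 0.2},
}

# Gate 1 keeps CD3-high cells; gate 2 keeps CD4-high, CD8-low cells among them.
t_cell_gate = RectGate("CD3", "CD8", (1.0, 5.0), (0.0, 5.0))
cd4_gate = RectGate("CD4", "CD8", (1.0, 5.0), (0.0, 1.0))

gated = apply_gates(cells, [t_cell_gate, cd4_gate])
print(gated)
```

Storing each gate as a small immutable record keeps the boundaries reusable across datasets, which is the kind of redundancy in hand-written gate code that the abstract says interactive tools aim to remove.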


Subjects
Mononuclear Leukocytes , Software , Humans , RNA Sequence Analysis/methods , Workflow , Flow Cytometry , Membrane Proteins , Single-Cell Analysis/methods , Gene Expression Profiling/methods
2.
Wiley Interdiscip Rev RNA ; 15(2): e1842, 2024.
Article in English | MEDLINE | ID: mdl-38605484

ABSTRACT

Spatial transcriptomics (ST) features high-throughput gene expression profiling within the native cell and tissue context, offering a means to investigate gene regulatory networks in the tissue microenvironment. In situ sequencing (ISS) is an imaging-based ST technology that simultaneously detects hundreds to thousands of genes at subcellular resolution. As a highly reproducible and robust technique, ISS has been widely adopted and has undergone a series of technical iterations. As interest in ISS-based spatial transcriptomic analysis grows, scalable and integrated data analysis workflows are needed to facilitate the application of ISS in different research fields. This review presents state-of-the-art bioinformatic toolkits for ISS data analysis, covering the upstream and downstream analysis workflows, including image analysis, cell segmentation, clustering, functional enrichment, detection of spatially variable genes and cell clusters, spatial cell-cell interactions, and trajectory inference. To assist the community in choosing the right tools for their research, the application of each tool and its compatibility with ISS data are reviewed in detail. Finally, future perspectives and challenges concerning how to integrate heterogeneous tools into a user-friendly analysis pipeline are discussed. This article is categorized under: RNA Methods > RNA Analyses In Vitro and In Silico.


Subjects
Gene Expression Profiling , Transcriptome , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , RNA , Spatial Analysis
3.
NPJ Syst Biol Appl ; 10(1): 36, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580667

ABSTRACT

By profiling gene expression in individual cells, single-cell RNA-sequencing (scRNA-seq) can resolve cellular heterogeneity and cell-type gene expression dynamics. Its application to time-series samples can identify temporal gene programs active in different cell types, for example, immune cells' responses to viral infection. However, current scRNA-seq analysis has limitations. One is the low number of genes detected per cell. The second is insufficient replicates (often 1-2) due to high experimental cost. The third lies in the data analysis: treating individual cells as independent measurements leads to inflated statistics. To address these, we explore a new computational framework, specifically whether "metacells" constructed to maintain cellular heterogeneity within individual cell types (or clusters) can be used as "replicates" for increasing statistical rigor. Toward this, we applied SEACells to a time-series scRNA-seq dataset from peripheral blood mononuclear cells (PBMCs) after SARS-CoV-2 infection to construct metacells, and used them in maSigPro for quadratic regression to find significantly differentially expressed genes (DEGs) over time, followed by clustering expression velocity trends. We showed that such metacells retained greater expression variances and produced more biologically meaningful DEGs compared to metacells generated randomly or by simple pseudobulk methods. More specifically, this approach correctly identified the known ISG15 interferon response program in almost all PBMC cell types and many DEGs enriched in the previously defined SARS-CoV-2 infection response pathway. It also uncovered additional and more cell type-specific temporal gene expression programs. Overall, our results demonstrate that the metacell-pseudoreplicate strategy could potentially overcome the limitation of 1-2 replicates.
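The idea of using metacells as replicates can be sketched as follows. This is an illustrative assumption, not the SEACells or maSigPro interface: the metacell assignment is taken as given (in the study it would come from SEACells), and the function name, cell labels, and expression values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def metacell_replicates(expression, metacell_of, timepoint_of):
    """Average one gene's expression within each metacell, grouped by time point.

    Each metacell mean then acts as one 'replicate' for its time point,
    instead of treating every cell as an independent measurement.
    """
    groups = defaultdict(list)
    for cell, value in expression.items():
        groups[(timepoint_of[cell], metacell_of[cell])].append(value)

    per_time = defaultdict(list)
    for (timepoint, _metacell), values in sorted(groups.items()):
        per_time[timepoint].append(mean(values))
    return dict(per_time)


# Made-up expression of a single gene in six cells.
expression = {"c1": 1.0, "c2": 3.0, "c3": 5.0, "c4": 10.0, "c5": 12.0, "c6": 20.0}
# Assumed to be produced upstream by a metacell method.
metacell_of = {"c1": "m1", "c2": "m1", "c3": "m2", "c4": "m3", "c5": "m3", "c6": "m4"}
# Sampling day of each cell (hypothetical).
timepoint_of = {"c1": 0, "c2": 0, "c3": 0, "c4": 7, "c5": 7, "c6": 7}

replicates = metacell_replicates(expression, metacell_of, timepoint_of)
print(replicates)
```

The resulting per-time-point lists have one value per metacell, so a downstream regression over time sees a few aggregated replicates per time point rather than thousands of correlated cells.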


Subjects
COVID-19 , Gene Expression Profiling , Humans , Gene Expression Profiling/methods , Mononuclear Leukocytes/metabolism , RNA Sequence Analysis/methods , COVID-19/genetics , SARS-CoV-2
4.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38600665

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) facilitates the study of cell type heterogeneity and the construction of cell atlases. However, due to its limitations, many genes may be detected with zero expression, i.e. dropout events, leading to bias in downstream analyses and hindering the identification and characterization of cell types and cell functions. Although many imputation methods have been developed, their performance is generally lower than expected across different kinds and dimensions of data and application scenarios. Therefore, developing an accurate and robust single-cell gene expression data imputation method is still essential. To maintain the original cell-cell and gene-gene correlations and leverage bulk RNA sequencing (bulk RNA-seq) data, we propose scINRB, a single-cell gene expression imputation method with network regularization and bulk RNA-seq data. scINRB adopts network-regularized non-negative matrix factorization to ensure that the imputed data maintains the cell-cell and gene-gene similarities and also approaches the average gene expression calculated from bulk RNA-seq data. To evaluate the performance, we test scINRB on simulated and experimental datasets and compare it with other commonly used imputation methods. The results show that scINRB recovers gene expression accurately even in the case of high dropout rates and dimensions, preserves cell-cell and gene-gene similarities, and improves various downstream analyses including visualization, clustering and trajectory inference.


Subjects
Algorithms , Single-Cell Analysis , RNA-Seq , Single-Cell Analysis/methods , RNA Sequence Analysis/methods , Cluster Analysis , Gene Expression , Gene Expression Profiling , Software
5.
Genome Biol ; 25(1): 94, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622708

ABSTRACT

Recent innovations in single-cell RNA-sequencing (scRNA-seq) provide the technology to investigate biological questions at cellular resolution. Pooling cells from multiple individuals has become a common strategy, and droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences. An implicit challenge with scRNA-seq is the occurrence of doublets, i.e. droplets containing two or more cells. We develop Demuxafy, a framework to enhance donor assignment and doublet removal through the consensus intersection of multiple demultiplexing and doublet detecting methods. Demuxafy significantly improves droplet assignment by separating singlets from doublets and classifying the correct individual.
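A consensus over several demultiplexing and doublet-detection calls can be sketched as a simple agreement vote. The rule below is an assumption made for illustration and is not necessarily how Demuxafy combines its methods; the function name, the `min_agree` threshold, and the labels are hypothetical.

```python
from collections import Counter


def consensus_call(calls, min_agree):
    """Return the label most tools agree on, or 'unassigned' without enough support.

    calls: one label per tool for a single droplet, e.g. a donor ID or 'doublet'.
    min_agree: minimum number of tools that must report the same label;
               choose it above half the number of tools so ties cannot pass.
    """
    label, count = Counter(calls).most_common(1)[0]
    return label if count >= min_agree else "unassigned"


# Three hypothetical tools calling the same droplets.
clear_singlet = consensus_call(["donor1", "donor1", "donor1"], min_agree=2)
majority_singlet = consensus_call(["donor1", "donor1", "doublet"], min_agree=2)
disputed = consensus_call(["donor1", "donor2", "doublet"], min_agree=2)

print(clear_singlet, majority_singlet, disputed)
```

Requiring agreement trades some assigned droplets for fewer wrong assignments, since a droplet that the tools disagree on is left unassigned rather than forced into a donor.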


Subjects
Single-Cell Analysis , Humans , Single-Cell Analysis/methods , RNA Sequence Analysis/methods
6.
Genome Biol ; 25(1): 96, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38622747

ABSTRACT

We present a non-parametric statistical method called TDEseq that takes full advantage of smoothing spline basis functions to account for the dependence of multiple time points in scRNA-seq studies, and uses hierarchically structured linear additive mixed models to model the correlated cells within an individual. As a result, TDEseq demonstrates powerful performance in identifying four potential temporal expression patterns within a specific cell type. Extensive simulation studies and the analysis of four published scRNA-seq datasets show that TDEseq can produce well-calibrated p-values and up to 20% power gain over existing methods for detecting temporal gene expression patterns.


Subjects
Gene Expression Profiling , Single-Cell Analysis , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Computer Simulation , Gene Expression
7.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38632952

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) enables dissecting cellular heterogeneity in tissues, resulting in numerous biological discoveries. Various computational methods have been devised to delineate cell types by clustering scRNA-seq data, where clusters are often annotated using prior knowledge of marker genes. In addition to identifying pure cell types, several methods have been developed to identify cells undergoing state transitions, which often rely on prior clustering results. The present computational approaches predominantly investigate the local and first-order structures of scRNA-seq data using graph representations, while scRNA-seq data frequently display complex high-dimensional structures. Here, we introduce scGeom, a tool that exploits the multiscale and multidimensional structures in scRNA-seq data by analyzing the geometry and topology through curvature and persistent homology of both cell and gene networks. We demonstrate the utility of these structural features to reflect biological properties and functions in several applications, where we show that curvatures and topological signatures of cell and gene networks can help indicate transition cells and the differentiation potential of cells. We also illustrate that structural characteristics can improve the classification of cell types.


Subjects
Algorithms , Single-Cell Analysis , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Transcriptome , Cluster Analysis
8.
Genome Biol ; 25(1): 99, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637899

ABSTRACT

Spatial molecular data has transformed the study of disease microenvironments, though larger datasets pose an analytics challenge, prompting the direct adoption of single-cell RNA-sequencing tools, including normalization methods. Here, we demonstrate that library size is associated with tissue structure and that normalizing these effects out using commonly applied scRNA-seq normalization methods will negatively affect spatial domain identification. Spatial data should not be specifically corrected for library size prior to analysis, and algorithms designed for scRNA-seq data should be adopted with caution.
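To make concrete what library-size normalization removes, here is a minimal sketch of a common scRNA-seq style normalization (scale each profile to a fixed total, then log-transform). The function name, the scale factor of 10,000, and the counts are assumptions for illustration; the study's exact methods are not reproduced here.

```python
import math


def library_size_normalize(counts, scale=10_000):
    """Scale one spot's counts to a fixed total, then apply log(1 + x)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.log1p(c / total * scale) for c in counts]


# Two hypothetical spots with the same gene proportions but a tenfold
# difference in library size (e.g. a sparse and a dense tissue region).
sparse_spot = [10, 30, 60]
dense_spot = [100, 300, 600]

norm_sparse = library_size_normalize(sparse_spot)
norm_dense = library_size_normalize(dense_spot)

# After normalization the two profiles are indistinguishable.
max_diff = max(abs(a - b) for a, b in zip(norm_sparse, norm_dense))
print(sum(sparse_spot), sum(dense_spot), max_diff)
```

The tenfold difference in total counts disappears after normalization. If library size tracks tissue structure, as the abstract reports, that is exactly the signal a spatial domain method would lose.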


Subjects
Gene Expression Profiling , Single-Cell Analysis , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Algorithms , Biology
9.
BMC Genomics ; 25(1): 402, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38658838

ABSTRACT

BACKGROUND: In recent years, single-cell RNA sequencing (scRNA-seq) has become increasingly accessible to researchers in many fields. However, interpreting its data demands proficiency in multiple programming languages and bioinformatic skills, which limits researchers without such expertise in exploring information from scRNA-seq data. Therefore, there is a tremendous need to develop easy-to-use software covering all aspects of scRNA-seq data analysis. RESULTS: We proposed a clear analysis framework for scRNA-seq data, which emphasizes the fundamental and crucial role of cell identity annotation and abstracts the analysis process into three stages: upstream analysis, cell annotation and downstream analysis. The framework can equip researchers with a comprehensive understanding of the analysis procedure and facilitate effective data interpretation. Leveraging the developed framework, we engineered Shaoxia, an analysis platform designed to democratize scRNA-seq analysis by accelerating processing through high-performance computing capabilities and offering a user-friendly interface accessible even to wet-lab researchers without programming expertise. CONCLUSION: Shaoxia stands as a powerful and user-friendly open-source software for automated scRNA-seq analysis, offering comprehensive functionality for streamlined functional genomics studies. Shaoxia is freely accessible at http://www.shaoxia.cloud , and its source code is publicly available at https://github.com/WiedenWei/shaoxia .


Subjects
RNA Sequence Analysis , Single-Cell Analysis , Software , Single-Cell Analysis/methods , RNA Sequence Analysis/methods , Internet , Humans , Computational Biology/methods , RNA-Seq/methods , User-Computer Interface
10.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38653489

ABSTRACT

There is a growing interest in inferring context-specific gene regulatory networks from single-cell RNA sequencing (scRNA-seq) data. This involves identifying the regulatory relationships between transcription factors (TFs) and genes in individual cells, and then characterizing these relationships at the level of specific cell types or cell states. In this study, we introduce scGATE (single-cell gene regulatory gate) as a novel computational tool for inferring TF-gene interaction networks and reconstructing Boolean logic gates involving regulatory TFs using scRNA-seq data. In contrast to current Boolean models, scGATE eliminates the need for individual formulations and likelihood calculations for each Boolean rule (e.g. AND, OR, XOR). By employing a Bayesian framework, scGATE infers the Boolean rule after fitting the model to the data, resulting in significant reductions in time complexity for logic-based studies. We have applied single-cell assay for transposase-accessible chromatin with sequencing (scATAC-seq) data and TF DNA binding motifs to filter out non-relevant TFs in gene regulation. By integrating single-cell clustering with these external cues, scGATE is able to infer context-specific networks. The performance of scGATE is evaluated using synthetic and real single-cell multi-omics data from mouse tissues and human blood, demonstrating its superiority over existing tools for reconstructing TF-gene networks. Additionally, scGATE provides a flexible framework for understanding the complex combinatorial and cooperative relationships among TFs regulating target genes by inferring Boolean logic gates among them.
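For contrast with the Bayesian approach in this abstract, the sketch below shows the brute-force alternative it avoids: scoring every candidate Boolean rule separately against binarized data. It is not scGATE's algorithm, and the function name and the toy on/off vectors are hypothetical.

```python
import operator

# Candidate two-input Boolean rules named in the abstract.
GATES = {"AND": operator.and_, "OR": operator.or_, "XOR": operator.xor}


def best_gate(tf_a, tf_b, target):
    """Score each rule by the fraction of cells where its output matches the target.

    tf_a, tf_b, target: equal-length lists of 0/1 values, one entry per cell.
    Returns the best-scoring rule name and the full score table.
    """
    scores = {}
    for name, rule in GATES.items():
        matches = sum(int(rule(a, b) == t) for a, b, t in zip(tf_a, tf_b, target))
        scores[name] = matches / len(target)
    best = max(scores, key=scores.get)
    return best, scores


# Made-up binarized activity of two TFs and one target gene in four cells.
tf_a = [0, 0, 1, 1]
tf_b = [0, 1, 0, 1]
target = [0, 1, 1, 0]  # on only when exactly one TF is on

best, scores = best_gate(tf_a, tf_b, target)
print(best, scores)
```

Each rule needs its own pass over the data here, and the number of rules grows quickly with more input TFs, which illustrates why inferring the rule after a single model fit can reduce the time cost.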


Subjects
Gene Regulatory Networks , Single-Cell Analysis , Transcription Factors , Single-Cell Analysis/methods , Transcription Factors/metabolism , Transcription Factors/genetics , Animals , Mice , Computational Biology/methods , Bayes Theorem , Humans , Algorithms , RNA Sequence Analysis/methods , Gene Expression Regulation , Multiomics
11.
Nat Commun ; 15(1): 3323, 2024 Apr 18.
Article in English | MEDLINE | ID: mdl-38637518

ABSTRACT

Direct RNA sequencing offers the possibility to simultaneously identify canonical bases and epi-transcriptomic modifications in each single RNA molecule. Thus far, the development of computational methods has been hampered by the lack of biologically realistic training data that carries modification labels at molecular resolution. Here, we report on the synthesis of such samples and the development of a bespoke algorithm, mAFiA (m6A Finding Algorithm), that accurately detects single m6A nucleotides in both synthetic RNAs and natural mRNA at the single-read level. Our approach uncovers distinct modification patterns in single molecules that would appear identical at the ensemble level. Compared to existing methods, mAFiA also demonstrates improved accuracy in measuring site-level m6A stoichiometry in biological samples.


Subjects
Nucleotides , RNA , RNA/genetics , Messenger RNA/genetics , Base Sequence , RNA Sequence Analysis/methods
12.
STAR Protoc ; 5(1): 102926, 2024 Mar 15.
Article in English | MEDLINE | ID: mdl-38461412

ABSTRACT

Here, we present a protocol for the identification of differentially expressed genes through RNA sequencing analysis. Starting with FASTQ files from public datasets, this protocol leverages RumBall within a self-contained Docker system. We describe the steps for software setup, obtaining data, read mapping, sample normalization, statistical modeling, and gene ontology enrichment. We then detail procedures for interpreting results with plots and tables. RumBall internally utilizes popular tools, ensuring a comprehensive understanding of the analysis process.


Subjects
Gene Expression Profiling , Software , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , RNA-Seq , Gene Expression
13.
PLoS One ; 19(3): e0299358, 2024.
Article in English | MEDLINE | ID: mdl-38536877

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is a high-throughput experimental technique for studying gene expression at the single-cell level. As a key component of single-cell data analysis, differential expression analysis (DEA) serves as the foundation for all subsequent secondary studies. Although biological replicates are of vital importance in DEA, small numbers of biological replicates remain common in sequencing experiments, which may pose problems for current DEA methods. Therefore, it is necessary to conduct a thorough comparison of various DEA approaches under small biological replication. Here, we compare 6 performance metrics on both simulated and real scRNA-seq datasets to assess the adaptability of 8 DEA approaches, with a particular emphasis on how well they function under small biological replication. Our findings suggest that DEA algorithms extended from bulk RNA-seq are still competitive under small biological replicate conditions, whereas the newly developed method DEF-scRNA-seq, which is based on information entropy, offers significant advantages. Our research not only provides appropriate suggestions for selecting DEA methods under different conditions, but also emphasizes the application value of machine learning algorithms in this field.


Subjects
COVID-19 , Software , Humans , Gene Expression Profiling/methods , RNA Sequence Analysis/methods , COVID-19/genetics , Algorithms , Single-Cell Analysis/methods , Cluster Analysis
14.
BMC Bioinformatics ; 25(1): 138, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553675

ABSTRACT

Even though high-throughput transcriptome sequencing is routinely performed in many laboratories, computational analysis of such data remains a cumbersome process often executed manually, hence error-prone and lacking reproducibility. For corresponding data processing, we introduce Curare, an easy-to-use yet versatile workflow builder for analyzing high-throughput RNA-Seq data focusing on differential gene expression experiments. Data analysis with Curare is customizable and subdivided into preprocessing, quality control, mapping, and downstream analysis stages, providing multiple options for each step while ensuring the reproducibility of the workflow. For a fast and straightforward exploration and visualization of differential gene expression results, we provide the gene expression visualizer software GenExVis. GenExVis can create various charts and tables from simple gene expression tables and DESeq2 results without the requirement to upload data or install software packages. In combination, Curare and GenExVis provide a comprehensive software environment that supports the entire data analysis process, from the initial handling of raw RNA-Seq data to the final DGE analyses and result visualizations, thereby significantly easing data processing and subsequent interpretation.


Subjects
Curare , RNA-Seq , Reproducibility of Results , RNA Sequence Analysis/methods , Transcriptome , Software , High-Throughput Nucleotide Sequencing/methods , Gene Expression Profiling/methods
15.
Genome Biol ; 25(1): 81, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553769

ABSTRACT

The use of single-cell technologies for clinical applications requires disconnecting sampling from downstream processing steps. Early sample preservation can further increase robustness and reproducibility by avoiding artifacts introduced during specimen handling. We present FixNCut, a methodology for the reversible fixation of tissue followed by dissociation that overcomes current limitations. We applied FixNCut to human and mouse tissues to demonstrate the preservation of RNA integrity, sequencing library complexity, and cellular composition, while diminishing stress-related artifacts. Besides single-cell RNA sequencing, FixNCut is compatible with multiple single-cell and spatial technologies, making it a versatile tool for robust and flexible study designs.


Subjects
Genomics , RNA , Humans , Animals , Mice , Tissue Fixation/methods , Reproducibility of Results , RNA Sequence Analysis/methods , RNA/genetics , Genomics/methods , Single-Cell Analysis/methods
16.
Lab Chip ; 24(8): 2287-2297, 2024 Apr 16.
Article in English | MEDLINE | ID: mdl-38506394

ABSTRACT

We introduce a simple integrated analysis method that links cellular phenotypic behaviour with single-cell RNA sequencing (scRNA-seq) by utilizing a combination of optical indices from cells and hydrogel beads. With our method, the combinations, referred to as joint colour codes, enable the link via matching the optical combinations measured by conventional epi-fluorescence microscopy with the concatenated DNA molecular barcodes created by cell-hydrogel bead pairs and sequenced by next-generation sequencing. We validated our approach by demonstrating an accurate link between the cell image and scRNA-seq with mixed species experiments, longitudinal cell tagging by electroporation and lipofection, and gene expression analysis. Furthermore, we extended our approach to multiplexed chemical transcriptomics, which enabled us to identify distinct phenotypic behaviours in HeLa cells treated with various concentrations of paclitaxel, and determine the corresponding gene regulation associated with the formation of a multipolar spindle.


Subjects
Gene Expression Profiling , Transcriptome , Humans , HeLa Cells , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Hydrogels , Single-Cell Analysis/methods , RNA Sequence Analysis/methods
17.
Nat Commun ; 15(1): 2765, 2024 Mar 29.
Article in English | MEDLINE | ID: mdl-38553455

ABSTRACT

Single-cell technologies can measure the expression of thousands of molecular features in individual cells undergoing dynamic biological processes. While examining cells along a computationally ordered pseudotime trajectory can reveal how changes in gene or protein expression impact cell fate, identifying such dynamic features is challenging due to the inherent noise in single-cell data. Here, we present DELVE, an unsupervised feature selection method for identifying a representative subset of molecular features which robustly recapitulate cellular trajectories. In contrast to previous work, DELVE uses a bottom-up approach to mitigate the effects of confounding sources of variation, and instead models cell states from dynamic gene or protein modules based on core regulatory complexes. Using simulations, single-cell RNA sequencing, and iterative immunofluorescence imaging data in the context of cell cycle and cellular differentiation, we demonstrate how DELVE selects features that better define cell types and cell-type transitions. DELVE is available as an open-source Python package: https://github.com/jranek/delve .


Subjects
Gene Expression Profiling , Software , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Cell Differentiation , Cell Cycle/genetics , RNA Sequence Analysis/methods
18.
Cell Rep Methods ; 4(3): 100733, 2024 Mar 25.
Article in English | MEDLINE | ID: mdl-38503288

ABSTRACT

Here, we present Anchored-fusion, a highly sensitive fusion gene detection tool. It anchors a gene of interest, which often involves driver fusion events, and recovers non-unique matches of short-read sequences that are typically filtered out by conventional algorithms. In addition, Anchored-fusion contains a module based on a deep learning hierarchical structure that incorporates self-distillation learning (hierarchical view learning and distillation [HVLD]), which effectively filters out false positive chimeric fragments generated during sequencing while retaining true fusion genes. Anchored-fusion enables highly sensitive detection of fusion genes, thus allowing for application in cases with low sequencing depths. We benchmarked Anchored-fusion under various conditions and found that it outperformed other tools in detecting fusion events in simulated data, bulk RNA sequencing (bRNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. Our results demonstrate that Anchored-fusion can be a useful tool for fusion detection tasks in clinically relevant RNA-seq data and can be applied to investigate intratumor heterogeneity in scRNA-seq data.


Subjects
Algorithms , Software , RNA-Seq , RNA Sequence Analysis/methods , RNA/genetics
19.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38487851

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool for investigating cellular heterogeneity through high-throughput analysis of individual cells. Nevertheless, challenges arise from prevalent sequencing dropout events and noise effects, impacting subsequent analyses. Here, we introduce a novel algorithm, Single-cell Gene Importance Ranking (scGIR), which utilizes a single-cell gene correlation network to evaluate gene importance. The algorithm transforms single-cell sequencing data into a robust gene correlation network through statistical independence, with correlation edges weighted by gene expression levels. We then constructed a random walk model on the resulting weighted gene correlation network to rank the importance of genes. Our analysis of gene importance using the PageRank algorithm across nine authentic scRNA-seq datasets indicates that scGIR can effectively surmount technical noise, enabling the identification of cell types and inference of developmental trajectories. We demonstrated that the edges of gene correlation, weighted by expression, play a critical role in enhancing the algorithm's performance. Our findings emphasize that scGIR excels at enhancing the clustering of cell subtypes, reverse-identifying differentially expressed marker genes, and uncovering genes with potential differential importance. Overall, we proposed a promising method capable of extracting more information from single-cell RNA sequencing datasets, potentially shedding new light on cellular processes and disease mechanisms.
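The random-walk ranking step can be sketched with a plain power-iteration PageRank on a small weighted network. This is a generic PageRank written for illustration, not the scGIR implementation: the network construction from statistical independence is not reproduced, and the gene names, edge weights, damping factor, and iteration count are assumptions.

```python
def pagerank(weights, damping=0.85, iterations=100):
    """Rank nodes of a weighted graph by power iteration.

    weights: dict mapping each gene to a dict of neighbour -> non-negative
             edge weight; every neighbour must also appear as a key.
    """
    genes = sorted(weights)
    n = len(genes)
    rank = {g: 1.0 / n for g in genes}
    out_total = {g: sum(weights[g].values()) for g in genes}

    for _ in range(iterations):
        # Teleportation term shared by every gene.
        new_rank = {g: (1.0 - damping) / n for g in genes}
        for g in genes:
            if out_total[g] == 0:
                # A gene without edges spreads its rank evenly.
                for h in genes:
                    new_rank[h] += damping * rank[g] / n
            else:
                # Otherwise rank flows along edges in proportion to weight.
                for h, w in weights[g].items():
                    new_rank[h] += damping * rank[g] * w / out_total[g]
        rank = new_rank
    return rank


# Hypothetical gene correlation network: G1 is a hub linked to three genes.
network = {
    "G1": {"G2": 1.0, "G3": 1.0, "G4": 1.0},
    "G2": {"G1": 1.0},
    "G3": {"G1": 1.0},
    "G4": {"G1": 1.0},
}

ranks = pagerank(network)
top_gene = max(ranks, key=ranks.get)
print(top_gene, ranks)
```

Because rank flows along edges in proportion to their weights, changing how edges are weighted (for example by expression level, as the abstract describes) directly changes which genes come out as important.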


Subjects
Gene Regulatory Networks , Single-Cell Analysis , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Algorithms , Cluster Analysis , Gene Expression Profiling/methods
20.
Bioinformatics ; 40(3)2024 Mar 04.
Article in English | MEDLINE | ID: mdl-38439545

ABSTRACT

MOTIVATION: Removal of batch effects between multiple datasets from different experimental platforms has become an urgent problem as single-cell RNA sequencing (scRNA-seq) techniques have developed rapidly. Although some methods exist for this problem, most of them still face the challenge of under-correction or over-correction. Specifically, handling batch effects in highly nonlinear scRNA-seq data requires a more powerful model to address under-correction. Meanwhile, some previous methods focus too much on removing differences between batches, which may disturb the biological signal heterogeneity of datasets generated from different experiments, thereby leading to over-correction. RESULTS: In this article, we propose a novel multi-layer adaptation autoencoder with a dual-channel framework to address the under-correction and over-correction problems in batch effect removal. The method is called BERMAD and can achieve better results in scRNA-seq data integration and joint analysis. First, we design a multi-layer adaptation architecture to model distribution differences between batches at different feature granularities. Distribution matching on various layers of the autoencoder with different feature dimensions can result in a more accurate batch correction outcome. Second, we propose a dual-channel framework, where the deep autoencoder processing each single dataset is independently trained. Hence, the heterogeneous information that is not shared between different batches can be retained more completely, which can alleviate over-correction. Comprehensive experiments on multiple scRNA-seq datasets demonstrate the effectiveness and superiority of our method over state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION: The code implemented in Python and the data used for experiments have been released on GitHub (https://github.com/zhanglabNKU/BERMAD) and Zenodo (https://zenodo.org/records/10695073) with detailed instructions.


Subjects
Single-Cell Analysis , Single-Cell Gene Expression Analysis , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis